(54) Abstract Title

Remotely assessing which of the software modules installed in a server are active 

(57) In a distributed data processing system there are plural application servers each having a database of
executable data management tasks and an initialisation list indicating which of these tasks should be active. A
server can be probed to determine whether it has successfully initialised all the tasks in its initialisation list.
The probed server can be instructed to initialise tasks that have failed to become active. Monitoring of network 
messages and integrity of e-mail routes are also disclosed, as is checking the replication of database changes 
between servers.

SERVER PROBE METHOD AND APPARATUS FOR A DISTRIBUTED DATA PROCESSING SYSTEM

The present invention relates to a method and apparatus for probing 
remote application servers in a distributed data processing system. 

Some conventional data processing environments comprise a plurality 
of user terminals connected to a central host data processing system. 
Such data processing environments are typically referred to as central or
host environments.

Increasing in popularity are distributed data processing 
environments in which user terminals are connected to plural server data 
processing systems. 

In both of the above examples, the cost of systems management can
be measured by the ratio of administrators (or operation support staff)
to users. In a typical distributed environment, such as an environment
providing a Lotus Notes service or similar distributed client-server
database application, the ratio is relatively high: one Lotus Notes
(Lotus and Lotus Notes are trade marks of Lotus Development Corporation)
administrator may have difficulty controlling over 200 users of a fully
functional Lotus Notes service. By comparison, in a typical host
environment such as an OfficeVision environment (OfficeVision is a trade
mark of International Business Machines Corporation), a single
administrator may comfortably control thousands of users.

In a typical distributed environment employing a distributed 
database management system, a group of administrators collectively 
perform operational tasks associated with management of servers such as 
Groupware and E mail servers. Both E Mail and Groupware applications 
usually generate megabytes of information during normal daily operation. 
The information is typically stored in a log format. The logs are 
preferably processed with a view to identifying error conditions and thus 
to eliminating or at least reducing application server failures. However,
the processing of such logs is a laborious activity. It would
therefore be desirable to improve automation of server management in a
distributed environment.

In accordance with the present invention, there is now provided
server probe apparatus for a distributed data processing system having
plural application server computer systems interconnected via a network,
each application server having a database application including a
plurality of executable data management tasks and an initialisation table
storing a task configuration indicative of which ones of the tasks should
be active, the apparatus comprising: means for reading the task
configuration from the initialisation table from a target one of the
application servers; means for identifying the tasks which are active in
the target one of the application servers; and means for generating an
event message if active tasks identified differ from the tasks specified
in the initialisation table.

The generating means preferably comprises restart means for making 
successive attempts to restart tasks which are specified in the 
initialisation table but inactive in the target one of the application 
servers and means for generating the event message after a predetermined 
plurality of failed attempts by the restart means to restart the or each
inactive task.

In preferred embodiments of the present invention, there is
additionally provided an administration terminal having means for
displaying the event messages.

It will be appreciated that the present invention extends to a
distributed data processing system having plural application server
computer systems interconnected via a network, each application server
having a database application including a plurality of executable data
management tasks and an initialisation table storing a task configuration
indicative of which ones of the tasks should be active, and server probe
apparatus as hereinbefore described. 

Viewing the present invention from another aspect, there is now
provided a method for managing data management tasks in a distributed
data processing system having plural application server computer systems
interconnected via a network, each application server having a database
application including a plurality of executable data management tasks and
an initialisation table storing a task configuration indicative of which
ones of the tasks should be active, the method comprising the steps of:
reading the task configuration from the initialisation table from a
target one of the application servers; identifying the tasks which are
active in the target one of the application servers; and generating an
event message if active tasks identified differ from the tasks specified
in the initialisation table.

Preferred embodiments of the present invention will now be
described, by way of example only, with reference to the accompanying
drawings, in which:

Figure 1 is a block diagram of a distributed data processing
system;

Figure 2 is a more detailed block diagram of the data processing
system of Figure 1;

Figure 3 is a block diagram of a DSM server of the system shown in
Figure 2;

Figure 4 is a block diagram of an application server of the system
shown in Figure 2;

Figure 5 is a block diagram of a high level architecture for the
DSM server;

Figure 6 is a block diagram of software stored in an application
server of the system shown in Figure 2;

Figure 7 is a functional block diagram of the DSM server;

Figure 8 is a block diagram of a server probe function of the DSM
server in the form of a flow chart;

Figure 9 is a block diagram of a mail probe function of the DSM
server in the form of a flow chart; and

Figure 10 is a block diagram of another distributed data processing
environment embodying the present invention.

Referring first to Figure 1, a distributed data processing system
embodying the present invention comprises a plurality of application
server computer systems 40-70 and a Distributed Systems Monitor (DSM)
server computer system 10, all interconnected via a network 5.



With reference now to Figure 2, each application server 40-70 
provides a service to a set of client user terminals 90-93. The
application servers 40-70 are connected to the DSM server 10. The DSM
server 10 is also connected to an administration terminal 20.

Referring to Figure 3, the DSM server 10 comprises a system random
access memory (RAM) 200, a system read only memory (ROM) 210, a central
processing unit (CPU) 220, a mass storage device 230 comprising one or
more large capacity magnetic disks or similar data recording media, one
or more removable storage means 240 such as floppy disk drives, CD ROM
drives and the like, a network adaptor 250, a keyboard adaptor 260, a
pointing device adaptor 270, and a display adaptor 280, all
interconnected via a bus architecture 290. The CPU 220 is a Pentium
100 MHz central processor (Pentium is a trade mark of Intel Corporation).
It will be appreciated that other embodiments of the present invention
may employ an equivalent to a Pentium 100 MHz CPU to perform the function
of CPU 220. The RAM 200 is at least 4d^ megabytes in capacity. A keyboard 
300 is coupled to the bus architecture 290 via the keyboard adaptor 260.
Similarly, a pointing device 310, such as a mouse, touch screen, tablet,
tracker ball or the like, is coupled to the bus architecture 290 via the
pointing device adaptor 270. Equally, a display output device 320, such
as a cathode ray tube (CRT) display, liquid crystal display (LCD) panel,
or the like, is coupled to the bus architecture 290 via the display
adaptor 280. Additionally, the DSM server 10 is coupled to the terminal
20 and the servers 40-70 via the network adaptor 250.

Basic input output system (BIOS) software is stored in the ROM 210
for enabling data communications between the CPU 220, mass storage 230,
RAM 200, ROM 210, removable storage 240, and adaptors 250-280 via the bus
architecture 290. Stored on the mass storage device 230 is operating
system software and application software including DSM software. Further
application software may be loaded into the DSM server 10 via the
removable storage 240 or the network adaptor 250. The operating system
software enables the DSM server 10 to select and run the application
software. The application software stored in the DSM server 10 includes
Lotus Notes Release 4. Lotus Notes 4 is a document-based database
management system. Further details of Lotus Notes 4 can be found in
Mastering Lotus Notes 4 by Brown, Brown, Koutchouk and Brown, published
in 1996 by Sybex Inc. As will be described shortly, in operation, the
DSM server 10 employs Lotus Notes 4 to communicate with the application
servers 40-70.

It will be appreciated that, in some embodiments of the present
invention, terminal 20 may be integral to the DSM server 10, with
control functions of the terminal 20 facilitated via the
display 320 and the input devices 300 and 310 of DSM server 10.

Referring to Figure 4, each application server 40-70 comprises a
system random access memory (RAM) 700, a system read only memory (ROM)
710, a central processing unit (CPU) 720, a mass storage device 730
comprising one or more large capacity magnetic disks or similar data
recording media, one or more removable storage means 740 such as floppy
disk drives, CD ROM drives and the like, a network adaptor 750, a
keyboard adaptor 760, a pointing device adaptor 770, and a display
adaptor 780, all interconnected via a bus architecture 790. The CPU 720
may be an Intel Pentium 100 MHz central processor or equivalent. A
keyboard 800 is coupled to the bus architecture 790 via the keyboard
adaptor 760. Similarly, a pointing device 810, such as a mouse, touch
screen, tablet, tracker ball or the like, is coupled to the bus
architecture 790 via the pointing device adaptor 770. Equally, a display
output device 820, such as a cathode ray tube (CRT) display, liquid
crystal display (LCD) panel, or the like, is coupled to the bus
architecture 790 via the display adaptor 780. Additionally, each
application server 40-70 is coupled to the DSM server 10 and to remote
client terminals 90-93 via the network adaptor 750.

Basic input output system (BIOS) software is stored in the ROM 710
for enabling data communications between the CPU 720, mass storage 730,
RAM 700, ROM 710, removable storage 740, and adaptors 750-780 via the bus
architecture 790. Stored on the mass storage device 730 is operating
system software and application software. The application software
includes a distributed client-server database application such as Lotus
Notes 4 and Lotus cc:Mail. In operation, each application server 40-70
employs the resident client-server database application to communicate
with both the remote client terminals 90-93 and the DSM server 10.
Further application software may be loaded into each application server
40-70 via the removable storage 740 or the network adaptor 750. In
operation, the operating system software enables each application server
40-70 to select and run the application software.

Referring back to Figure 2, the application server 40 is a cc:Mail
server running Lotus cc:Mail on the OS/2 operating system platform (OS/2
is a trade mark of International Business Machines Corporation) produced
by International Business Machines Corporation to provide cc:Mail
services to users of the connected client terminals 90. The application
server 50 is a Notes server running Lotus Notes 4 on the Windows NT
Server operating system (Windows and Windows NT are trade marks of
Microsoft Corporation) produced by Microsoft Corporation to provide Notes
services to users of the connected client terminals 91. The application
server 60 is a Notes server running Lotus Notes 4 on the OS/2 operating
system to provide Notes services to users of the connected client
terminals 92. The application server 70 is a Notes server running Lotus
Notes 4 on the UNIX or AIX operating systems (UNIX is licensed
exclusively through X/Open Company Limited; AIX is a trade mark of
International Business Machines Corporation) to provide Notes services to
users of the connected client terminals 93. It will be appreciated that,
in other embodiments of the present invention, there may be more or fewer
than four application servers operating on any one or more of the
aforementioned or different operating system platforms.

DSM Server: General

It will be appreciated from Figure 2 that the DSM server is located,
in terms of system hierarchy, between the application servers 40-70 and
the administration terminal 20. In operation, the DSM server operates as a
mid-level systems manager. The application servers 40-70
record data transfers in which they are involved, such as messages and E
Mail to or from the connected client terminals 90-93, in log files. The
log files maintained by the application servers 40-70 are directed to the
DSM server 10. The DSM server 10 processes the received log files to
reduce the amount of reporting information sent to the administration
terminal 20. Provided that the application servers 40-70 can route such
log files to the DSM server 10 and, in the case of Notes servers 50-70,
employ the Notes communication protocol, the operating system platform is
not relevant.

Referring now to Figure 5, the high level architecture of the DSM
server 10 comprises a first layer 11 for performing Process, Action,
Notify and Report functions. Below the first layer 11 is a second
function layer 12 for performing Log, Analyze and Filter functions. Below
the second layer 12 is a Lotus Notes layer 13. In operation, the Notes
layer 13 enables the DSM server 10 to communicate with the application
servers 50-70. To facilitate such communication, the Notes layer 13
includes a Notes mail message transfer agent (MTA) 14, a cc:Mail MTA 15,
a Simple Mail Transfer Protocol (SMTP) mail MTA 16, and an X.400 mail
MTA (not shown). The message transfer agents avoid the need to include
special mail gateways to communicate with mail systems which are foreign
to Notes. Below the Notes layer 13 is a network layer 17 for interfacing
the DSM server 10 with the application servers 40-70. The mass storage
230 comprises a Notes data store 21, a cc:Mail store 22, and an archive
data store 23. Data from the application servers 40-70, such as STATREP
and LOG.NSF files 110 or other mail system log files 100, is received in
the DSM server 10 at the network layer 17 and passed via the MTAs 14-16
of the Notes layer 13 to the second layer 12 where it is processed by the
Log, Analyze, and Filter functions. The Log function records incoming
data in the mass storage 230. The filtered data is passed from the second
layer 12 to the first layer 11 where the data is processed by the
Process, Action, Notify and Report functions. The Action function may
generate, in response to the received data, corrective instructions 81
which are returned to the application servers 40-70. Depending on system
configuration, a delay may be imposed in the passage of data from the
second layer 12 to the first layer 11 in relation to one or more of the
functions therein. For example, in some embodiments of the present
invention, the Report function may be set to activate only once a week,
with the data to be reported remaining logged in the mass storage 230
until then.

The high level architecture of each of the application
servers 40-70 comprises a mail layer 41 providing Notes or cc:Mail
functionality on an OS/2, NT or AIX operating system platform as the case
may be. Below the mail layer 41 is a log layer 42 for supplying log files
to the DSM server 10. Below the log layer 42 is a network layer 43 for
interfacing with the network layer 17 of the DSM server 10.

Lotus Collection Agent

Referring to Figure 6, each Notes application server 50-70 runs
Notes 910 on an operating system 900 such as the AIX, OS/2 or NT operating
system. In each Notes application server 50-70, Notes 910 comprises a
NOTES.INI file 911 and a plurality of Notes tasks 913-916. The tasks
913-916 include a Router task 913, a Replicator task 914, and a Reporter
task 915. In addition, each Notes application server 50-70 includes a
Notes collection agent 912. The Notes collection agent 912 operates as a
task within Notes 910. The NOTES.INI file 911 defines the tasks which are
to be started in Notes 910 when the host Notes application server 50-70 is
booted. The Notes collection agent 912 is specified within the NOTES.INI
file 911. Thus, the Notes collection agent 912 is active whenever the host
application server 50-70 is operational. When active, the Notes
collection agent 912 enables the DSM server 10 to communicate with the
Notes application servers 50-70 via the Notes protocol. The Notes
application servers 50-70 send their respective operational statistics to
a database file called LOG.NSF. At definable intervals, the Notes
collection agent 912 copies information from the LOG.NSF file to an
intermediate database, from which the information is formatted and mailed
via the mail layer 41 to the DSM server 10 as represented by 110 in
Figure 5. The information collected from each Notes application server
50-70 falls into one of four categories: log data, server tasks, E Mail
routing, and replication. For each category, the user can select one of
the following options:

1) Collect and process all documents in that category (default);

2) Collect all documents of this type now, but process them later;
and

3) Discard all documents of this category.

The user can change the action for any document category at any
time. The Notes collection agent 912 maintains a time and date stamp to
record the last successful poll of the LOG.NSF file. This stamp is
recorded in the NOTES.INI file 911. If one of the tasks 913-916 is
started with either no time stamp or an invalid time stamp, the Notes
collection agent 912 will create it before processing any data. The
parameters used by the Notes collection agent 912 can be viewed and
configured via each of the Notes servers 50-70. Statistics can also be
recorded in the STATREP database and routed to the DSM server 10 for
processing as also represented by mail flow 110 in Figure 5.
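
The polling behaviour described above can be illustrated with a minimal Python sketch. It is not the Notes collection agent itself: the record layout, the `Action` names and the `poll_log` function are hypothetical, chosen only to show how a last-poll time stamp and the three per-category options might interact.

```python
from datetime import datetime
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class Action(Enum):
    PROCESS_NOW = 1    # option 1: collect and process (default)
    PROCESS_LATER = 2  # option 2: collect now, process later
    DISCARD = 3        # option 3: discard


# Hypothetical record layout: (time stamp, category, text).
Entry = Tuple[datetime, str, str]


def poll_log(
    entries: Iterable[Entry],
    last_poll: datetime,
    actions: Dict[str, Action],
) -> Tuple[List[Entry], List[Entry], datetime]:
    """Split entries newer than last_poll by the action chosen for their category."""
    process_now: List[Entry] = []
    process_later: List[Entry] = []
    newest = last_poll
    for entry in entries:
        stamp, category, _text = entry
        if stamp <= last_poll:
            continue  # already seen in an earlier poll
        newest = max(newest, stamp)
        action = actions.get(category, Action.PROCESS_NOW)
        if action is Action.PROCESS_NOW:
            process_now.append(entry)
        elif action is Action.PROCESS_LATER:
            process_later.append(entry)
        # Action.DISCARD: the entry is dropped
    return process_now, process_later, newest
```

The returned `newest` value plays the role of the time and date stamp that would be written back after a successful poll.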

cc:Mail Collection Agent 

Application server 40 acts as a cc:Mail Post Office router. The DSM
server 10 appears to application server 40 as a peer Post Office via the
cc:Mail MTA 15. Application server 40 further comprises a cc:Mail
collection agent. The cc:Mail collection agent and the cc:Mail MTA 15 in
combination enable cc:Mail log files to be mailed by application server
40 to the DSM server 10. The cc:Mail collection agent is similar in
function to the Notes collection agent 912 hereinbefore described. In
operation, the cc:Mail MTA 15 and cc:Mail collection agent cooperate in
gathering cc:Mail router log data from application server 40 and in
routing such log data to the DSM server 10 without interrupting normal
Router function. This process operates as a cc:Mail call list entry
through which cc:Mail logs are collected and supplied to the DSM server
10 at predefined intervals. This enables the DSM server 10 to process
these log files off-line from the cc:Mail message router service provided
by application server 40.

DSM Server Functions

Referring now to Figure 7, in operation, the DSM server 10 acts as 
a mid-level system manager for managing the activities of the application 
servers 40-70. To facilitate such management, the first layer 11 of the 
DSM architecture includes the following Process functions: 

a) a Monitor function 540; 

b) a Server Probe function 550;

c) an E Mail Probe function 560; and 

d) a Database Replication Tracking function 570. 

The functions 540-570 are performed by the CPU 220 when configured
by corresponding software retained in the mass storage 230 of the DSM
server 10. It will be appreciated that, in other embodiments of the
present invention, similar functionality may be provided by hardware or
by a combination of hardware and software.

Referring back to Figure 2, the log files maintained by the
application servers 40-70 are directed to the DSM server 10 as generally
represented by communication paths 80. The log files are stored by DSM
server 10 in mass storage 230. The log file corresponding to each
application server 40-70 is stored on a separate disk of mass storage
230.

Returning to Figure 7, on receipt of the log files, the Monitor
function 540 of the DSM server 10 filters the messages contained in each
log. The filtered messages are sent from the DSM server 10 to the
terminal 20 for display to an administrator as generally represented in
Figure 2 by 90. The Server probe function 550 and the Mail probe function
560 automatically operate selected ones of the application servers 40-70.

The integration of the functions 540-570 within the DSM server 10
enables the DSM server 10 to analyze message log files produced by the
application servers 40-70 and to route filtered messages to
the administration terminal 20, thereby allowing administration staff to
react more swiftly to critical messages.

Error messages contained in the log files from the application
servers 40-70 (e.g. communications, router, security, resource, and server
environment error messages) are captured and reported by the DSM server
10 via Notes mail, Simple Network Management Protocol (SNMP) trap
Protocol Data Units (PDUs), and logging to a Notes database. Control of
the application servers 40-70 can be passed to predefined user exits within
the DSM server 10 when the above alerts are processed.
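
As a rough illustration of this kind of filtering, the Python sketch below keeps only error records in the five categories named above. The `severity;category;text` line format and the `filter_log` function are assumptions made for the example; the actual Notes and cc:Mail log formats are not described in this specification.

```python
from typing import Iterable, List, Tuple

# Categories of error message named in the description.
ERROR_CATEGORIES = (
    "communications",
    "router",
    "security",
    "resource",
    "server environment",
)


def filter_log(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (category, text) for error lines in a reported category.

    Assumes a hypothetical 'severity;category;text' line format.
    """
    kept: List[Tuple[str, str]] = []
    for line in lines:
        parts = [part.strip() for part in line.split(";", 2)]
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed format
        severity, category, text = parts
        if severity.upper() == "ERROR" and category.lower() in ERROR_CATEGORIES:
            kept.append((category.lower(), text))
    return kept
```

Only the records returned by such a filter would need to be forwarded as alerts; everything else can stay in the stored log.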

Each application server 40-70 corresponds to a different dedicated
disk in the mass storage 230 of the DSM server 10. The dedicated disk is
employed to record all information relevant to the corresponding
application server 40-70. Specifically, the information is organised by
the DSM server 10 in a standard format with sub-directories named DATA,
STATUS, and REPORTS.
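
A minimal sketch of this layout in Python follows. It models each dedicated disk as a directory under a common root, which is an assumption made so that the example is portable; the server names and the `create_server_stores` function are likewise hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable

# Sub-directory names given in the description.
SUBDIRS = ("DATA", "STATUS", "REPORTS")


def create_server_stores(root: Path, servers: Iterable[str]) -> Dict[str, Path]:
    """Create DATA, STATUS and REPORTS sub-directories for each server."""
    stores: Dict[str, Path] = {}
    for name in servers:
        base = root / name  # stands in for the server's dedicated disk
        for sub in SUBDIRS:
            (base / sub).mkdir(parents=True, exist_ok=True)
        stores[name] = base
    return stores
```

Calling the function a second time with the same arguments leaves the existing directories untouched, so it can be run at every start-up.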

A summary status file is created by the DSM server 10 and stored in
the STATUS sub-directory for access by administration staff wishing to
review the latest activities of the application servers 40-70. A
Graphical User Interface (GUI) provided by Notes 4 enables administration
staff to view all information collected by the DSM server 10 from the
application servers 40-70 via a database navigator.

Information from server logs and statistics may be summarised on a
weekly and monthly basis to provide administrators with information to
manage present data processing requirements and plan for future demands.
The information is held on a Notes database in the form of Notes
statistics, operating system statistics, network statistics, and response
time summaries. The application server log files and statistics can be
archived via the DSM server 10 automatically on a monthly basis to a
desired destination.

Server Probe Function 

The server probe function 550 monitors, via the Notes collection
agent 912, each Notes application server 50-70 to ensure activation of
the Notes tasks 913-916 specified in the NOTES.INI file of each Notes
application server 50-70. Tasks which are not running (perhaps due
to failure) are started automatically by the server probe function 550.
In addition, the server probe function 550 records and summarises, via
the Notes collection agent 912 in each Notes application server 50-70,
response times from the DSM server 10 to the Notes application servers
50-70 on a daily basis.
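
The daily summarising of response times might be sketched in Python as below. The sample layout (server name, time stamp, response time in seconds) and the choice of count, minimum, mean and maximum as the summary are assumptions for illustration; the specification does not say which statistics are kept.

```python
from collections import defaultdict
from datetime import date, datetime
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# Hypothetical sample layout: (server name, time stamp, response time in seconds).
Sample = Tuple[str, datetime, float]


def summarise_daily(
    samples: Iterable[Sample],
) -> Dict[Tuple[str, date], Dict[str, float]]:
    """Group response times by server and calendar day and summarise each group."""
    grouped: Dict[Tuple[str, date], List[float]] = defaultdict(list)
    for server, stamp, seconds in samples:
        grouped[(server, stamp.date())].append(seconds)
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "mean": mean(values),
            "max": max(values),
        }
        for key, values in grouped.items()
    }
```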

If any of the Notes application servers 50-70 goes off-line for any 
reason, the server probe function 550 will raise a severity 1 alert. The 
severity 1 alert is sent by the DSM server 10 to the administration 
terminal 20. The server probe function 550 continuously checks, via the 
Notes collection agent 912, that the specified tasks 913-916 are active 
and functioning correctly. In the event of a problem with any of the 
tasks 913-916, the server probe function 550 will automatically attempt 
to restart it via the Notes collection agent 912. After a predefined 
number of failed attempts to restart a particular task 913-916, the
server probe function 550 routes an alert to the administration terminal
20. Failure of the host application server 50-70 is recorded by the DSM
server 10 both within and beyond committed service time. 

The server probe function 550 will now be described with reference
to the flow chart of Figure 8. In operation, the server probe function
550 issues commands in Notes protocol to a target application server
50-70. The commands issued by the server probe function 550 are handled
within Notes 910 by the Notes collection agent 912 of the target
application server 50-70. Initially, at block 1000, the server probe
function 550 resets a restart count to zero. Then, at block 1010, the
server probe function 550 sends a "show task configuration" command to
the Notes collection agent 912. The "show task configuration" command
captures from the NOTES.INI file in the target application server 50-70
the tasks 913-916 which should be active. At block 1020, the server probe
function 550 sends a "show active tasks" command to the target
application server 50-70. The "show active tasks" command captures the
tasks 913-916 which are active on the target application server 50-70. At
block 1030, the server probe function 550 compares the task configuration
with the active tasks. If the task configuration is the same as the
active tasks, then the server probe function 550 terminates at block
1080. If the task configuration is different from the active tasks,
indicating that one or more of the tasks 913-916 has failed, then, at
block 1040, the server probe function 550 determines if the restart count
equals a predetermined threshold of attempts to restart the failed tasks.
If so, then, at block 1040, the server probe function 550 issues an alert
message for supply to the administration terminal 20. If not, then, at
block 1050, the server probe function increments the restart count and,
at block 1060, attempts to restart those tasks specified in the task
configuration which were reported as inactive. The server probe function
then continues around the loop defined by blocks 1020, 1030, 1040, 1060,
and 1070 until either all required tasks are active or the threshold is
reached.
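
The probe loop of Figure 8 can be sketched in Python as follows. This is a sketch under stated assumptions, not the DSM implementation: the three callables stand in for the Notes commands, `FakeServer` is a hypothetical in-memory stand-in for a target server, and the comparison treats only configured-but-inactive tasks as a difference.

```python
from typing import Callable, Iterable, Optional, Set


def probe_server(
    read_task_config: Callable[[], Iterable[str]],
    read_active_tasks: Callable[[], Iterable[str]],
    restart_task: Callable[[str], None],
    max_restarts: int = 3,
) -> Optional[str]:
    """Return None once all configured tasks are active, else an alert message."""
    restart_count = 0                               # reset the restart count
    configured: Set[str] = set(read_task_config())  # "show task configuration"
    while True:
        active = set(read_active_tasks())           # "show active tasks"
        missing = configured - active                # compare the two sets
        if not missing:
            return None                              # nothing to do
        if restart_count >= max_restarts:
            return "ALERT: tasks still inactive after %d restart attempts: %s" % (
                restart_count,
                ", ".join(sorted(missing)),
            )
        restart_count += 1
        for task in sorted(missing):
            restart_task(task)                       # attempt a restart


class FakeServer:
    """Hypothetical in-memory target server used only to exercise the loop."""

    def __init__(self, configured, active, restartable):
        self.configured = list(configured)
        self.active = set(active)
        self.restartable = set(restartable)

    def task_config(self):
        return self.configured

    def active_tasks(self):
        return set(self.active)

    def restart(self, task):
        if task in self.restartable:
            self.active.add(task)
```

For example, `probe_server(srv.task_config, srv.active_tasks, srv.restart)` returns `None` when every inactive task on `srv` can be restarted, and an alert string when a task stays down after `max_restarts` attempts.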

E-Mail Probe Function 

The E Mail probe function 560 tests mail routes in the network of
Notes application servers 50-70 by measuring the time taken for a test
message in the form of a Lotus Notes document to complete a return trip
to a reflecting server 50-70 and comparing it against predefined
thresholds. An example of a test report produced by the E Mail probe
function 560 is provided in Appendix A hereto. The E Mail probe function
560 generates an alert if a threshold is exceeded. Additionally, the E
Mail probe function 560 generates reports including elapsed time across
each E Mail application server en route. Specifically, the E Mail probe
acquires the local date and time from each server 50-70 both on entry and
exit. The entry and exit date and time for each server are recorded in
the Notes document forming the test message. In any mail application, it
is important for administration staff to know if there are any mail
delivery problems and the time taken to deliver the mail. The E Mail
probe function converts any problems arising in the Notes mail network
into alerts for forwarding to the administration terminal 20. The E Mail
probe function 560 also automatically generates E Mail tracking reports.
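
The timing check can be illustrated with a short Python sketch. The hop layout and the `analyse_round_trip` function are hypothetical, and the sketch assumes the server clocks are close enough for the recorded local times to be compared directly, which the specification does not state.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# Hypothetical hop record: (server name, entry time, exit time).
Hop = Tuple[str, datetime, datetime]


def analyse_round_trip(
    hops: List[Hop], threshold: timedelta
) -> Tuple[List[Tuple[str, timedelta]], timedelta, bool]:
    """Return per-server elapsed times, the total trip time and an alert flag."""
    if not hops:
        raise ValueError("the test message recorded no hops")
    per_server = [(name, exit_t - entry_t) for name, entry_t, exit_t in hops]
    total = hops[-1][2] - hops[0][1]  # last exit minus first entry
    return per_server, total, total > threshold
```

The per-server list corresponds to the elapsed time across each server en route, and the flag indicates whether the predefined threshold was exceeded.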

Replication Tracking Function

In some Notes applications it is important that data stored in
databases is shadowed between different application servers 50-70. Such
shadowing can be achieved via the Notes Replicator task 914. The Notes
replication tracking function 570 of the DSM server 10 checks databases
on the application servers 40-70 to establish if they are synchronised
after the Notes Replicator task 914 has been executed by two or more of
the Notes application servers 50 to 70. If the databases are out of
synchronisation, the DSM server 10 sends an alert to the administration
terminal 20. The Notes replication tracking function 570 verifies that,
after Notes replication server activity has occurred, databases of the
same replica ID have the same contents.
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block 1050, the server probe function increments the restart count and, 
at block 1060, attempts to restart those tasks specified in the task 
configuration which were reported as inactive. The server probe function 
then continues around the loop defined by blocks 1020, 1030, 1040, 1060,
and 1070 until either all required tasks are active or the threshold is
exceeded.
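
The restart loop described above can be sketched as follows. This is a minimal illustration only, not the implementation of the server probe function: the names `probe_server`, `list_active_tasks`, `restart_task` and `restart_threshold` are hypothetical, the two callables stand in for whatever mechanism queries and restarts tasks on the target server, and the exact threshold comparison is an assumption.

```python
from typing import Callable, Iterable, Set


def probe_server(required_tasks: Iterable[str],
                 list_active_tasks: Callable[[], Set[str]],
                 restart_task: Callable[[str], None],
                 restart_threshold: int) -> Set[str]:
    """Return the required tasks that are still inactive when probing stops.

    An empty result means every required task is active. A non-empty result
    means the restart threshold was reached, so an event message is due.
    """
    required = set(required_tasks)
    restart_count = 0
    while True:
        # Tasks named in the task configuration but not reported as active.
        inactive = required - list_active_tasks()
        if not inactive:
            return set()
        if restart_count >= restart_threshold:  # assumed comparison
            return inactive
        restart_count += 1                      # cf. block 1050
        for task in sorted(inactive):           # cf. block 1060
            restart_task(task)


# Simulated server: "replica" never comes up, "update" restarts cleanly.
active = {"router"}


def fake_list_active() -> Set[str]:
    return set(active)


def fake_restart(task: str) -> None:
    if task != "replica":
        active.add(task)


still_down = probe_server(["router", "replica", "update"],
                          fake_list_active, fake_restart,
                          restart_threshold=2)
print(sorted(still_down))
```

In this sketch the caller decides what to do with the returned set, for example forwarding an event message to the administration terminal when it is not empty.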

E-Mail Probe Function 

The E-Mail probe function 560 tests mail routes in the network of
Notes application servers 50-70 by measuring the time taken for a test
message, in the form of a Lotus Notes document, to complete a return trip
to a reflecting server 50-70, and comparing that time against predefined
thresholds. An example of a test report produced by the E-Mail probe
function 560 is provided in Appendix A hereto. The E-Mail probe function
560 generates an alert if a threshold is exceeded. Additionally, the
E-Mail probe function 560 generates reports including the elapsed time
across each E-Mail application server en route. Specifically, the E-Mail
probe acquires the local date and time from each server 50-70 both on
entry and on exit. The entry and exit date and time for each server are
recorded in the Notes document forming the test message. In any mail
application, it is important for administration staff to know whether
there are any mail delivery problems and the time taken to deliver the
mail. The E-Mail probe function 560 converts any problems arising in the
Notes mail network into alerts for forwarding to the administration
terminal 20. The E-Mail probe function 560 also automatically generates
E-Mail tracking reports.
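
As an illustration of the per-server timing described above, the following sketch computes the time a test message spent at each server from its recorded entry and exit stamps, and flags servers that exceed a threshold. It is a sketch under stated assumptions: the names `Hop`, `seconds_at_server` and `slow_hops` are hypothetical, the timestamp layout is assumed from the report in Appendix A, and the 60 second threshold is an arbitrary example value. The three hops are taken from Appendix A.

```python
from datetime import datetime
from typing import List, NamedTuple, Tuple


class Hop(NamedTuple):
    server: str
    entered: str   # local time the probe entered MAIL.BOX on the server
    left: str      # local time the probe left MAIL.BOX on the server


# Assumed layout of the stamps in the report. Whether the first two fields
# are month-day or day-month does not affect a same-day difference.
STAMP = "%m-%d-%y %I:%M:%S %p"


def seconds_at_server(hop: Hop) -> int:
    """Elapsed whole seconds between entry and exit on a single server."""
    entered = datetime.strptime(hop.entered, STAMP)
    left = datetime.strptime(hop.left, STAMP)
    return int((left - entered).total_seconds())


def slow_hops(hops: List[Hop], threshold_s: int) -> List[Tuple[str, int]]:
    """Servers where the probe waited longer than the threshold."""
    return [(h.server, seconds_at_server(h))
            for h in hops if seconds_at_server(h) > threshold_s]


hops = [
    Hop("GBLPR403/GBLPR4", "12-02-97 04:34:24 PM", "12-02-97 04:34:25 PM"),
    Hop("GBLPR401/GBLPR4", "12-02-97 04:37:52 PM", "12-02-97 04:53:35 PM"),
    Hop("GBMP0028/LCS", "12-02-97 04:47:12 PM", "12-02-97 04:47:15 PM"),
]

for server, waited in slow_hops(hops, threshold_s=60):
    print(f"ALERT: probe spent {waited} s at {server}")
```

Because each stamp is the local time of the server that recorded it, the sketch only subtracts stamps taken on the same server and does not compare clocks across servers.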

Replication Tracking Function

In some Notes applications, it is important that data stored in
databases is shadowed between different application servers 50-70. Such
shadowing can be achieved via the Notes replication task 914. The Notes
replication tracking function 570 of the DSM server 10 checks databases
on the application servers 40-70 to establish whether they are
synchronised after the Notes replicator task 914 has been executed by two
or more of the Notes application servers 50 to 70. If the databases are
out of synchronisation, the DSM server 10 sends an alert to the
administration terminal 20. The Notes replication tracking function 570
verifies that, after Notes replication server activity has occurred,
databases of the same replica ID have the same contents.
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By way of example with reference to Figure 1; suppose a database 
ABC.NSF is stored on application server 50 and replicated on application 
server 60 via the replication task 914. Hence, application server 50 and
application server 60 each store a copy of ABC.NSF. Suppose now that a
client user connected to application server 50 is modifying the copy of
ABC.NSF stored on application server 50 and, simultaneously, a client
user connected to application server 60 is modifying the copy of ABC.NSF
stored on application server 60. The replication task 914 on application
server 50 periodically replicates the modified copy of ABC.NSF on
application server 60. Likewise, the replication task 914 on application
server 60 periodically replicates the modified copy of ABC.NSF on
application server 50. The frequency at which replication takes place can
be preset according to user needs. For example, if the database contains
relatively important information which is frequently modified by client
users, then correspondingly frequent replication activity might be
appropriate. Conversely, if the information contained in the database is
less important, replication may be set to take place less frequently.
From a systems management perspective, it would be desirable to ensure
that the replication tasks on application servers 50 and 60 are set to
perform replication of ABC.NSF at sufficient frequency to accommodate the
regularity with which end user clients of application servers 50 and 60
independently modify ABC.NSF. This problem is solved by the replication
tracking function 570 of the DSM server 10.

Referring to Figure 9, in operation, the replication tracking
function 570 is initialised, at block 1110, by setting a SAMPLING
INTERVAL to the desired number of samples of the copies of ABC.NSF on
application servers 50 and 60 to be taken; by setting a TARGET COUNT to
the number of matches in the copies of ABC.NSF on application servers 50
and 60 to be found in the sampling interval; and by resetting running
totals HIT COUNT and SAMPLE COUNT to zero. At block 1110, the replication
tracking function 570 samples both the copy of ABC.NSF stored on
application server 50 and the copy of ABC.NSF stored on application
server 60. At block 1120, the replication tracking function 570 compares
the two samples. If the two samples match then, at block 1130, the
replication tracking function 570 increments HIT COUNT and progresses to
block 1140. If the two samples do not match, then the replication
tracking function 570 progresses directly to block 1140, at which SAMPLE
COUNT is incremented. At block 1150, the replication tracking function
570 compares SAMPLE COUNT with SAMPLING INTERVAL. If SAMPLE COUNT does
not equal SAMPLING INTERVAL, then the replication tracking function 570
returns to block 1110 to collect the next pair of samples. If SAMPLE
COUNT has reached SAMPLING INTERVAL, then the replication tracking
function 570 compares, at block 1160, HIT COUNT with TARGET COUNT. If HIT
COUNT is less than TARGET COUNT then, at block 1170, the replication
tracking function 570 issues an alert to the administration terminal
indicating that the copies of ABC.NSF stored on application servers 50
and 60 are not synchronised. Otherwise, the replication tracking function
570 terminates.
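
The counting logic of Figure 9 can be sketched as below. This is an illustrative sketch rather than the implementation of the replication tracking function 570: the name `track_replication` and the two sampling callables are hypothetical, and what constitutes a "sample" (for example a digest or modification stamp of each copy) is an assumption, since the description does not specify it.

```python
from typing import Callable, Hashable


def track_replication(sample_copy_a: Callable[[], Hashable],
                      sample_copy_b: Callable[[], Hashable],
                      sampling_interval: int,
                      target_count: int) -> bool:
    """Return True if an alert is due, i.e. too few sample pairs matched."""
    hit_count = 0
    sample_count = 0
    # The description loops until SAMPLE COUNT equals SAMPLING INTERVAL;
    # "<" is used here so that a non-positive interval cannot loop forever.
    while sample_count < sampling_interval:
        if sample_copy_a() == sample_copy_b():   # cf. blocks 1110, 1120
            hit_count += 1                       # cf. block 1130
        sample_count += 1                        # cf. block 1140
    return hit_count < target_count              # cf. blocks 1160, 1170


# Simulated samples of the two copies: they disagree on the third sample.
copy_on_50 = iter(["v1", "v2", "v3", "v4"])
copy_on_60 = iter(["v1", "v2", "v2", "v4"])

alert = track_replication(lambda: next(copy_on_50),
                          lambda: next(copy_on_60),
                          sampling_interval=4, target_count=4)
print("alert" if alert else "synchronised")
```

Setting the target count below the sampling interval tolerates a number of transient mismatches, which may suit databases that are modified frequently between replications.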

It will be appreciated that the replication tracking function 570 
may be employed to track replication of more than one database. Equally, 
it will be appreciated that the replication tracking function 570 may be
employed to track replication of more than two copies of the or each
database. Furthermore, it will be appreciated that the replication
tracking function 570 may apply different test parameters (eg: SAMPLING
INTERVAL, TARGET COUNT) to different databases or groups of databases.
Still furthermore, it will be appreciated that the samples forming the
sampling interval may be taken over a relatively short period of time or
over a relatively long period of time depending on customer requirements.

Application Server Log Data Storage

Because, in embodiments of the present invention, the log files are
retained by the mass storage 230 of the DSM server 10 rather than by the
application servers 40-70, the application servers 40-70 are able to
dedicate more resource to client activities.

DSM Server Hierarchy and Scalability

The DSM server 10 can operate as a module. Therefore, multiple DSM
servers may be employed to accommodate a large number of different
application servers, with each DSM server serving a group of
application servers. Referring now to Figure 5, in an example of such an
arrangement, there is provided a plurality of application servers 400-450
and a plurality of DSM servers 460-480. Each DSM server 460-480 is
connected to a different group of application servers 400-450. The DSM
servers 460-480 are each connected to a DSM master server 490. In
operation, all MTA communications with the application servers 400-450
are handled by the DSM servers 460-480, but the DSM servers 460-480
communicate with the master DSM server 490 via the Notes MTA alone. In
the example of the present invention hereinbefore described with
reference to Figure 5, there are 3 DSM servers 460-480. However, it will
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be appreciated that, in other embodiments of the present invention, there 
may be only two DSM servers reporting to the master DSM server 490. 
Likewise, in other embodiments of the present invention, there may be 
greater than three DSM servers either reporting to the master DSM server
490 or to one or more further layers of intermediate DSM servers
arranged in a hierarchical structure ending at the master DSM server
490. 

Typically, operational centres are consolidated into one or two 
geographical areas in the interests of cost. This means that from 
relatively few operational centres, systems management control is 
exercised over relatively large regions. The arrangement shown in Figure 
5 is particularly suitable for this scenario. 

By way of summary then, what has been hereinbefore described by way 
of example of the present invention is a server probe for use in a 
distributed data processing system having plural application server 
computer systems interconnected via a network wherein each application 
server has a database application including a plurality of executable 
data management tasks and an initialisation table storing a task 
configuration indicative of which ones of the tasks should be active. The
server probe reads the task configuration from the initialisation table
from a target one of the application servers, identifies the tasks which
are active in the target one of the application servers, and generates an
event message if active tasks identified differ from the tasks specified
in the initialisation table.
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APPENDIX A 

DSM Mail Probe Status Information for
D06ML002/06/M/IBM...
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Mail Probe Reflector Name 


    Joe Public/UK/IBM@IBMGB@GBLPO000

Time Taken For Mail Probe to Return

    957 seconds

Number of Server(s) Mail Probe Was Routed Through

    12

Domains That The Mail Probe Passed Through

    IBMGB, UKGNAMIG, GBLPO000

Detailed Information

>> denotes times for Mail Probe on its journey to the Reflector
<< denotes times for Mail Probe on its journey back from the Reflector

Notes Server Name    Time Mail Probe         Time Mail Probe         Time Mail Probe
                     Entered MAIL.BOX        Left MAIL.BOX           Spent At Server
                     On Server               On Server

GBLPR403/GBLPR4      12-02-97 04:34:24 PM    12-02-97 04:34:25 PM    1 second(s)   >>
GBLPR401/GBLPR4      12-02-97 04:37:52 PM    12-02-97 04:53:35 PM    943 second(s) >>
GBMP0028/LCS         12-02-97 04:47:12 PM    12-02-97 04:47:15 PM    3 second(s)   >>
DSMLN003/UKGNAMIG    12-02-97 04:50:47 PM    12-02-97 04:50:49 PM    2 second(s)   >>
D06HUBM1/06/H/IBM    12-02-97 04:47:35 PM    12-02-97 04:47:36 PM    1 second(s)   >>
D06ML002/06/M/IBM    12-02-97 05:51:14 PM    12-02-97 05:51:15 PM    1 second(s)   >>
D06ML002/06/M/IBM    12-02-97 05:51:15 PM    12-02-97 05:51:15 PM    0 second(s)   <<
D06HUBM1/06/H/IBM    12-02-97 04:47:37 PM    12-02-97 04:47:38 PM    1 second(s)   <<
DSMLN003/UKGNAMIG    12-02-97 04:50:51 PM    12-02-97 04:50:53 PM    2 second(s)   <<
GBMP0028/LCS         12-02-97 04:47:21 PM    12-02-97 04:47:22 PM    1 second(s)   <<
GBLPR401/GBLPR4      12-02-97 04:53:44 PM    12-02-97 04:53:48 PM    4 second(s)   <<
GBLPR403/GBLPR4      12-02-97 04:50:21 PM    12-02-97 04:50:21 PM    0 second(s)   <<
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CLAIMS 



1. Server probe apparatus for a distributed data processing system 
having plural application server computer systems interconnected via a 
network, each application server having a database application including 
a plurality of executable data management tasks and an initialisation 
table storing a task configuration indicative of which ones of the tasks 
should be active, the apparatus comprising:

means for reading the task configuration from the initialisation 
table from a target one of the application servers; 

means for identifying the tasks which are active in the target one
of the application servers; and 

means for generating an event message if active tasks identified 
differ from the tasks specified in the initialisation table. 

2. Apparatus as claimed in claim 1, wherein the generating means 
comprises restart means for making successive attempts to restart tasks 
which are specified in the initialisation table but inactive in the 
target one of the application servers and means for generating the event 
message after a predetermined plurality of failed attempts by the restart 
means to restart the or each inactive task. 

3. Apparatus as claimed in claim 1 or claim 2, comprising an 
administration terminal having means for displaying the event messages.

4. A distributed data processing system having plural application 
server computer systems interconnected via a network, each application 
server having a database application including a plurality of executable 
data management tasks and an initialisation table storing a task 
configuration indicative of which ones of the tasks should be active, and 
server probe apparatus as claimed in any preceding claim. 

5. A method for managing data management tasks in a distributed data
processing system having plural application server computer systems 
interconnected via a network, each application server having a database 
application including a plurality of executable data management tasks and 
an initialisation table storing a task configuration indicative of which 
ones of the tasks should be active, the method comprising the steps of: 
reading the task configuration from the initialisation table from a
target one of the application servers;

identifying the tasks which are active in the target one of the
application servers; and

generating an event message if active tasks identified differ from
the tasks specified in the initialisation table.

6. A method as claimed in claim 5, wherein the generating step
comprises:

making successive attempts to restart tasks which are specified in
the initialisation table but inactive in the target one of the
application servers; and

generating the event message after a predetermined plurality of
failed attempts to restart the or each inactive task.

7. A method as claimed in claim 5 or claim 6, comprising displaying
the event message on a display device.
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